
    The use of random projections for the analysis of mass spectrometry imaging data

    The ‘curse of dimensionality’ imposes fundamental limits on the analysis of the large, information rich datasets that are produced by mass spectrometry imaging. Additionally, such datasets are often too large to be analyzed as a whole and so dimensionality reduction is required before further analysis can be performed. We investigate the use of simple random projections for the dimensionality reduction of mass spectrometry imaging data and examine how they enable efficient and fast segmentation using k-means clustering. The method is computationally efficient and can be implemented such that only one spectrum is needed in memory at any time. We use this technique to reveal histologically significant regions within MALDI images of diseased human liver. Segmentation results achieved following a reduction in the dimensionality of the data by more than 99% (without peak picking) showed that histologic changes due to disease can be automatically visualized from molecular images. [Figure: see text] ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s13361-014-1024-7) contains supplementary material, which is available to authorized users
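    The pipeline described in this abstract (random projection followed by k-means segmentation) can be sketched roughly as below. This is a minimal illustration on synthetic data, not the authors' implementation: the Gaussian projection matrix, the toy spectra, the constants, and the farthest-first k-means initialisation are all assumptions made for the example. Spectra are consumed one at a time from a generator, although this sketch also holds the projection matrix itself in memory.

```python
import numpy as np

N_CHANNELS = 2000   # m/z bins per spectrum (hypothetical value)
N_PROJ = 20         # projected dimensions: a 99% reduction
N_PER_GROUP = 30    # synthetic pixels per "tissue type"

def synthetic_spectra(seed=1):
    """Yield toy spectra one at a time, standing in for a reader that streams pixels."""
    rng = np.random.default_rng(seed)
    for peak_start in (100, 1500):            # two groups with different peak positions
        for _ in range(N_PER_GROUP):
            spectrum = 0.01 * rng.random(N_CHANNELS)   # low-level noise
            spectrum[peak_start:peak_start + 10] += 10.0
            yield spectrum

def random_projection(spectra, n_channels, n_proj, seed=0):
    """Project each spectrum onto n_proj random Gaussian directions as it arrives."""
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(n_proj) preserves squared distances in expectation.
    R = rng.standard_normal((n_channels, n_proj)) / np.sqrt(n_proj)
    return np.array([spectrum @ R for spectrum in spectra])

def kmeans(X, n_clusters, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [X[0]]
    while len(centers) < n_clusters:
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

reduced = random_projection(synthetic_spectra(), N_CHANNELS, N_PROJ)
labels = kmeans(reduced, n_clusters=2)   # one label per pixel, i.e. a segmentation
```

    In a real dataset the labels would be reshaped to the image dimensions to give the segmentation map; the number of projections and clusters would need to be chosen for the data at hand.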

    Information processing for mass spectrometry imaging

    Mass Spectrometry Imaging (MSI) is a sensitive analytical tool for detecting and spatially localising thousands of ions generated across intact tissue samples. The datasets produced by MSI are large both in the number of measurements collected and the total data volume, which effectively prohibits manual analysis and interpretation. However, these datasets can provide insights into tissue composition and variation, and can help identify markers of health and disease, so the development of computational methods is required to aid their interpretation. To address the challenges of high dimensional data, randomised methods were explored for making data analysis tractable and were found to provide a powerful set of tools for applying automated analysis to MSI datasets. Random projections provided over 90% dimensionality reduction of MALDI MSI datasets, making them amenable to visualisation by image segmentation. Randomised basis construction was investigated for dimensionality reduction and data compression. Automated data analysis was developed that could be applied to data compressed to 1% of its original size, including segmentation and factorisation, providing a direct route to the analysis and interpretation of MSI datasets. Evaluation of these methods alongside established dimensionality reduction pipelines on simulated and real-world datasets showed they could reproducibly extract the chemo-spatial patterns present

    Identification and characterisation of human apoptosis inducing proteins using cell-based transfection microarrays and expression analysis

    BACKGROUND: Cell-based microarrays were first described by Ziauddin and Sabatini in 2001 as a powerful new approach for performing high throughput screens of gene function. An important application of cell-based microarrays is in screening for proteins that modulate gene networks. To this end, cells are grown over the surface of arrays of RNAi or expression reagents. Cells growing in the immediate vicinity of the arrayed reagents are transfected and the arrays can then be scanned for cells showing localised changes in function. Here we describe the construction of a large-scale microarray using expression plasmids containing human genes, its use in screening for genes that induce apoptosis when over-expressed and the characterisation of a number of these genes by following the transcriptional response of cell cultures during their induction of apoptosis. RESULTS: High-density cell-based arrays were successfully fabricated using 1,959 un-tagged open reading frames (ORFs) taken from the Mammalian Gene Collection (MGC) in mammalian expression vectors. The arrays were then used to screen for genes inducing apoptosis in Human Embryonic Kidney (HEK293T) cells. Using this approach, 10 genes were clearly identified and confirmed to induce apoptosis. Some of these genes have previously been linked to apoptosis, others not. The mechanisms of action of three of the 10 genes were then characterised further by following the transcriptional events associated with apoptosis induction using expression profiling microarrays. These data demonstrate a clear pro-apoptotic transcriptional response in cells undergoing apoptosis and also suggest the use of common apoptotic pathways regardless of the nature of the over-expressed protein triggering cell death. CONCLUSION: This study reports the design and use of the first truly large-scale cell-based microarrays for over-expression studies. Ten genes were confirmed to induce apoptosis, some of which were not previously known to possess this activity. Transcriptome analysis on three of the 10 genes demonstrated their use of similar pathways to invoke apoptosis.

    QTLRel: an R Package for Genome-wide Association Studies in which Relatedness is a Concern

    BACKGROUND: Existing software for quantitative trait mapping is either not able to model polygenic variation or does not allow incorporation of more than one genetic variance component. Improperly modeling the genetic relatedness among subjects can result in excessive false positives. We have developed an R package, QTLRel, to enable more flexible modeling of genetic relatedness as well as covariates and non-genetic variance components. RESULTS: We have successfully used the package to analyze many datasets, including F₃₄ body weight data that contains 688 individuals genotyped at 3105 SNP markers, and identified 11 QTL. It took 295 seconds to estimate variance components and 70 seconds to perform the genome scan on a Linux machine equipped with a 2.40GHz Intel(R) Core(TM)2 Quad CPU. CONCLUSIONS: QTLRel provides a toolkit for genome-wide association studies that is capable of calculating genetic incidence matrices from pedigrees, estimating variance components, performing genome scans, incorporating interactive covariates and genetic and non-genetic variance components, as well as other functionalities such as multiple-QTL mapping and genome-wide epistasis. This project was supported by NIH grants R01DA021336, R01MH079103 and R21DA024845.

    A Generative Deep Learning Approach to Stochastic Downscaling of Precipitation Forecasts

    Despite continuous improvements, precipitation forecasts are still not as accurate and reliable as those of other meteorological variables. A major contributing factor to this is that several key processes affecting precipitation distribution and intensity occur below the resolved scale of global weather models. Generative adversarial networks (GANs) have been demonstrated by the computer vision community to be successful at super-resolution problems, i.e., learning to add fine-scale structure to coarse images. Leinonen et al. (2020) previously applied a GAN to produce ensembles of reconstructed high-resolution atmospheric fields, given coarsened input data. In this paper, we demonstrate this approach can be extended to the more challenging problem of increasing the accuracy and resolution of comparatively low-resolution input from a weather forecasting model, using high-resolution radar measurements as a "ground truth". The neural network must learn to add resolution and structure whilst accounting for non-negligible forecast error. We show that GANs and VAE-GANs can match the statistical properties of state-of-the-art pointwise post-processing methods whilst creating high-resolution, spatially coherent precipitation maps. Our model compares favourably to the best existing downscaling methods in both pixel-wise and pooled CRPS scores, power spectrum information and rank histograms (used to assess calibration). We test our models and show that they perform in a range of scenarios, including heavy rainfall. Comment: Submitted to JAMES 4/4/2
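    For readers unfamiliar with the CRPS (continuous ranked probability score) mentioned above, a pixel-wise score for an ensemble forecast can be sketched as follows. The abstract does not state which estimator the authors used; the code assumes the common ensemble form CRPS = E|X - y| - 0.5 E|X - X'|, and the function name and array layout are hypothetical.

```python
import numpy as np

def crps_ensemble(members, observation):
    """Ensemble CRPS estimate; lower is better.

    members: array of shape (n_members, ...), e.g. (n_members, H, W) for fields.
    observation: scalar or array matching the trailing dimensions of members.
    """
    members = np.asarray(members, dtype=float)
    # Mean absolute error of the ensemble members against the observation.
    term_obs = np.abs(members - observation).mean(axis=0)
    # Mean absolute difference over all member pairs (ensemble spread).
    term_spread = np.abs(members[:, None] - members[None, :]).mean(axis=(0, 1))
    return term_obs - 0.5 * term_spread

# Two members at 0 and 2 with an observed value of 1: 1.0 - 0.5 * 1.0
example = crps_ensemble([0.0, 2.0], 1.0)   # 0.5
```

    Applied to a stack of generated precipitation fields, the function returns one score per pixel; a pooled variant would, under the same assumptions, apply spatial pooling to the fields before scoring.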

    Extensive loss of translational genes in the structurally dynamic mitochondrial genome of the angiosperm Silene latifolia

    BACKGROUND: Mitochondrial gene loss and functional transfer to the nucleus is an ongoing process in many lineages of plants, resulting in substantial variation across species in mitochondrial gene content. The Caryophyllaceae represents one lineage that has experienced a particularly high rate of mitochondrial gene loss relative to other angiosperms. RESULTS: In this study, we report the first complete mitochondrial genome sequence from a member of this family, Silene latifolia. The genome can be mapped as a 253,413 bp circle, but its structure is complicated by a large repeated region that is present in 6 copies. Active recombination among these copies produces a suite of alternative genome configurations that appear to be at or near "recombinational equilibrium". The genome contains the fewest genes of any angiosperm mitochondrial genome sequenced to date, with intact copies of only 25 of the 41 protein genes inferred to be present in the common ancestor of angiosperms. As observed more broadly in angiosperms, ribosomal proteins have been especially prone to gene loss in the S. latifolia lineage. The genome has also experienced a major reduction in tRNA gene content, including loss of functional tRNAs of both native and chloroplast origin. Even assuming expanded wobble-pairing rules, the mitochondrial genome can support translation of only 17 of the 61 sense codons, which code for only 9 of the 20 amino acids. In addition, genes encoding 18S and, especially, 5S rRNA exhibit exceptional sequence divergence relative to other plants. Divergence in one region of 18S rRNA appears to be the result of a gene conversion event, in which recombination with a homologous gene of chloroplast origin led to the complete replacement of a helix in this ribosomal RNA. CONCLUSIONS: These findings suggest a markedly expanded role for nuclear gene products in the translation of mitochondrial genes in S. latifolia and raise the possibility of altered selective constraints operating on the mitochondrial translational apparatus in this lineage.

    The “fossilized” mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate

    BACKGROUND: The mitochondrial genomes of flowering plants vary greatly in size, gene content, gene order, mutation rate and level of RNA editing. However, the narrow phylogenetic breadth of available genomic data has limited our ability to reconstruct these traits in the ancestral flowering plant and, therefore, to infer subsequent patterns of evolution across angiosperms. RESULTS: We sequenced the mitochondrial genome of Liriodendron tulipifera, the first from outside the monocots or eudicots. This 553,721 bp mitochondrial genome has evolved remarkably slowly in virtually all respects, with an extraordinarily low genome-wide silent substitution rate, retention of genes frequently lost in other angiosperm lineages, and conservation of ancestral gene clusters. The mitochondrial protein genes in Liriodendron are the most heavily edited of any angiosperm characterized to date. Most of these sites are also edited in various other lineages, which allowed us to polarize losses of editing sites in other parts of the angiosperm phylogeny. Finally, we added comprehensive gene sequence data for two other magnoliids, Magnolia stellata and the more distantly related Calycanthus floridus, to measure rates of sequence evolution in Liriodendron with greater accuracy. The Magnolia genome has evolved at an even lower rate, revealing a roughly 5,000-fold range of synonymous-site divergence among angiosperms whose mitochondrial gene space has been comprehensively sequenced. CONCLUSIONS: Using Liriodendron as a guide, we estimate that the ancestral flowering plant mitochondrial genome contained 41 protein genes, 14 tRNA genes of mitochondrial origin, as many as 7 tRNA genes of chloroplast origin, >700 sites of RNA editing, and some 14 colinear gene clusters. Many of these gene clusters, genes and RNA editing sites have been variously lost in different lineages over the course of the ensuing ∽200 million years of angiosperm evolution

    Association of common variation in the PPARA gene with incident myocardial infarction in individuals with type 2 diabetes: A Go-DARTS study

    RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are. BACKGROUND: Common variants of the PPARA gene have been found to associate with ischaemic heart disease in non-diabetic men. The L162V variant was found to be protective while the C2528G variant increased risk. L162V has also been associated with altered lipid measures. We therefore sought to determine the effect of PPARA gene variation on susceptibility to myocardial infarction in patients with type 2 diabetes. 1810 subjects with type 2 diabetes from the prospective Go-DARTS study were genotyped for the L162V and C2528G variants in the PPARA gene and the association of the variants with incident non-fatal myocardial infarction was examined. Cox's proportional hazards model was used to interrogate time to event from recruitment, and linear regression was used to analyse the association of genotype with quantitative clinical traits. RESULTS: The V162 allele was associated with decreased risk of non-fatal myocardial infarction (HR = 0.31, 95% CI 0.10–0.93, p = 0.037) whereas the C2528 allele was associated with increased risk (HR = 2.77, 95% CI 1.34–5.75, p = 0.006). Similarly, V162 was associated with a later mean age of diagnosis with type 2 diabetes and C2528 with an earlier age of diagnosis. C2528 was also associated with increased total cholesterol and LDL cholesterol, which did not account for the observed increased risk. Haplotype analysis demonstrated that when both rare variants occurred on the same haplotype the effect of each was abrogated. CONCLUSION: Genetic variation at the PPARA locus is important in determining cardiovascular risk in both male and female patients with diabetes. This genotype-associated risk appears to be independent of the effect of these genotypes on lipid profiles and age of diagnosis with diabetes. Published versio

    MICROGRID RESILIENCE ANALYSIS SOFTWARE DEVELOPMENT

    Military installation microgrids need to be resilient to a variety of potential disruptions (storms, attacks, et cetera). Various metrics for assessing microgrid resilience have been described in literature, and multiple tools for simulating microgrid performance have been constructed; however, it is often left to system owners and maintainers to bring these efforts together to identify and realize effective, efficient improvement strategies. Military microgrid stakeholders have expressed a desire for an integrated, unified platform that provides these multiple capabilities in a coordinated fashion. In support of these endeavors, analysis methods developed by NPS and NAVFAC Expeditionary Warfare Center researchers for measuring microgrid resilience have been integrated into an existing web-based microgrid power flow simulation and distributed energy resource rightsizing software tool. This was achieved by the development of additional functions and methods within the existing software platform code base, and expansion of the application programming interface (API). These API additions enabled access to the new calculation and analysis capabilities, as well as increased control over power flow simulation parameters. These analytical and functional contributions were validated through a design of experiments, including comparison to independently generated data, and factorial analysis. Outstanding Thesis. Civilian, Department of the Navy. Approved for public release. Distribution is unlimited.